169 research outputs found

    Risk, Unexpected Uncertainty, and Estimation Uncertainty: Bayesian Learning in Unstable Settings

    Recently, evidence has emerged that humans approach learning using Bayesian updating rather than (model-free) reinforcement algorithms in a six-arm restless bandit problem. Here, we investigate what this implies for human appreciation of uncertainty. In our task, a Bayesian learner distinguishes three equally salient levels of uncertainty. First, the Bayesian perceives irreducible uncertainty or risk: even knowing the payoff probabilities of a given arm, the outcome remains uncertain. Second, there is (parameter) estimation uncertainty or ambiguity: payoff probabilities are unknown and need to be estimated. Third, the outcome probabilities of the arms change: the sudden jumps are referred to as unexpected uncertainty. We document how the three levels of uncertainty evolved during the course of our experiment and how they affected the learning rate. We then zoom in on estimation uncertainty, which has been suggested to be a driving force in exploration, in spite of evidence of widespread aversion to ambiguity. Our data corroborate the latter. We discuss neural evidence that foreshadowed the ability of humans to distinguish between the three levels of uncertainty. Finally, we investigate the boundaries of human capacity to implement Bayesian learning. We repeat the experiment with different instructions, reflecting varying levels of structural uncertainty. Under this fourth notion of uncertainty, choices were no better explained by Bayesian updating than by (model-free) reinforcement learning. Exit questionnaires revealed that participants remained unaware of the presence of unexpected uncertainty and failed to acquire the right model with which to implement Bayesian updating.
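
    To make the three levels concrete, below is a minimal Python sketch (not the authors' model) of a Beta-Bernoulli learner for a single bandit arm: risk is the outcome variance that remains even at the posterior mean, estimation uncertainty is the posterior variance over the payoff probability, and the forgetting factor `decay` is a hypothetical device for coping with unexpected uncertainty (sudden jumps).

        import numpy as np

        # Minimal sketch, assuming a Beta-Bernoulli learner with exponential
        # forgetting; `decay` is an illustrative assumption, not the paper's model.
        class BetaArm:
            def __init__(self, decay=0.95):
                self.a, self.b = 1.0, 1.0   # Beta(1, 1) prior over payoff probability
                self.decay = decay          # discounts old evidence after jumps

            def update(self, reward):
                # Forgetting pulls the posterior back toward the prior, keeping
                # the learning rate elevated in an unstable environment.
                self.a = 1.0 + self.decay * (self.a - 1.0) + reward
                self.b = 1.0 + self.decay * (self.b - 1.0) + (1 - reward)

            def uncertainties(self):
                mean = self.a / (self.a + self.b)
                risk = mean * (1 - mean)          # irreducible outcome variance
                n = self.a + self.b
                estimation = (self.a * self.b) / (n**2 * (n + 1))  # posterior variance
                return mean, risk, estimation

        arm = BetaArm()
        rng = np.random.default_rng(0)
        p_true = 0.8
        for t in range(50):
            if t == 25:          # unexpected uncertainty: payoff probability jumps
                p_true = 0.2
            arm.update(int(rng.random() < p_true))
        print(arm.uncertainties())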

    Under pressure: Response urgency modulates striatal and insula activity during decision-making under risk

    When deciding whether to bet in situations that involve potential monetary loss or gain (mixed gambles), a subjective sense of pressure can influence the evaluation of the expected utility associated with each choice option. Here, we explored how gambling decisions, and their psychophysiological and neural counterparts, are modulated by an induced sense of urgency to respond. Urgency influenced decision times and evoked heart rate responses, interacting with the expected value of each gamble. Using functional MRI, we observed that this interaction was associated with changes in the activity of the striatum, a critical region for both reward and choice selection, and within the insula, a region implicated as the substrate of affective feelings arising from interoceptive signals that influence motivational behavior. Our findings bridge current psychophysiological and neurobiological models of value representation and action-programming, identifying the striatum and insular cortex as key substrates of decision-making under risk and urgency.
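
    For reference, the expected utility of a mixed gamble is often written with a loss-aversion weight, as in prospect theory; the sketch below is purely illustrative (the 50/50 probabilities and the weight `lam` are assumptions, not the study's fitted model).

        # Illustrative only: expected utility of a 50/50 mixed gamble with
        # potential gain G and potential loss L, weighted by loss aversion lam.
        def mixed_gamble_eu(gain, loss, lam=2.0):
            return 0.5 * gain - 0.5 * lam * loss

        # Accept the bet only when the weighted expectation is positive;
        # at lam = 2, a 20-vs-10 gamble sits exactly at indifference.
        print(mixed_gamble_eu(gain=20, loss=10))   # 0.0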

    Optogenetic Mimicry of the Transient Activation of Dopamine Neurons by Natural Reward Is Sufficient for Operant Reinforcement

    Activation of dopamine receptors in forebrain regions, for minutes or longer, is known to be sufficient for positive reinforcement of stimuli and actions. However, the firing rate of dopamine neurons is increased for only about 200 milliseconds following natural reward events that are better than expected, a response which has been described as a “reward prediction error” (RPE). Although RPE drives reinforcement learning (RL) in computational models, it has not been possible to directly test whether the transient dopamine signal actually drives RL. Here we have performed optical stimulation of genetically targeted ventral tegmental area (VTA) dopamine neurons expressing Channelrhodopsin-2 (ChR2) in mice. We mimicked the transient activation of dopamine neurons that occurs in response to natural reward by applying a light pulse of 200 ms in VTA. When a single light pulse followed each self-initiated nose poke, it was sufficient in itself to cause operant reinforcement. Furthermore, when optical stimulation was delivered in separate sessions according to a predetermined pattern, it increased locomotion and contralateral rotations, behaviors that are known to result from activation of dopamine neurons. All three of the optically induced operant and locomotor behaviors were tightly correlated with the number of VTA dopamine neurons that expressed ChR2, providing additional evidence that the behavioral responses were caused by activation of dopamine neurons. These results provide strong evidence that the transient activation of dopamine neurons provides a functional reward signal that drives learning, in support of RL theories of dopamine function.
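
    The RPE rule the abstract invokes fits in a few lines of Python; the Rescorla-Wagner form and learning rate `alpha` below are illustrative assumptions rather than the paper's fitted model.

        # Sketch: value updating driven by a reward prediction error (RPE).
        def rpe_update(value, reward, alpha=0.1):
            delta = reward - value          # RPE: better or worse than expected
            return value + alpha * delta, delta

        v = 0.0
        for trial in range(5):
            # each reward stands in for a ~200 ms dopamine transient
            v, delta = rpe_update(v, reward=1.0)
            print(f"trial {trial}: V={v:.3f}, delta={delta:.3f}")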

    Altered Neural and Behavioral Dynamics in Huntington's Disease: An Entropy Conservation Approach

    Background: Huntington’s disease (HD) is an inherited condition that results in neurodegeneration of the striatum, the forebrain structure that processes cortical information for behavioral output. In the R6/2 transgenic mouse model of HD, striatal neurons exhibit aberrant firing patterns that are coupled with reduced flexibility in the motor system. The aim of this study was to test the patterns of unpredictability in brain and behavior in wild-type (WT) and R6/2 mice. Methodology/Principal Findings: Striatal local field potentials (LFPs) were recorded from 18 WT and 17 R6/2 mice (aged 8–11 weeks) while the mice were exploring a plus-shaped maze. We targeted LFP activity for up to 2 s before and 2 s after each choice-point entry. Approximate Entropy (ApEn) was calculated for LFPs, and Shannon Entropy was used to measure the probability of arm choice, as well as the likelihood of making consecutive 90-degree turns in the maze. We found that although the total number of choice-point crossings and the entropy of arm-choice probability were similar in both groups, R6/2 mice had more predictable behavioral responses (i.e., they were less likely to make 90-degree turns and to perform them in alternation with running straight down the same arm), while exhibiting more unpredictable striatal activity, as indicated by higher ApEn values. In both WT and R6/2 mice, however, behavioral unpredictability was negatively correlated with LFP ApEn. Conclusions/Significance: HD results in a perseverative exploration of the environment, occurring in concert with more unpredictable striatal activity.
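
    The two entropy measures named above can be sketched in Python as follows; the window length `m` and tolerance `r` are common defaults, not necessarily the settings used in the study.

        import numpy as np

        def shannon_entropy(counts):
            """Entropy (bits) of arm-choice probabilities from visit counts."""
            p = np.asarray(counts, dtype=float)
            p = p[p > 0] / p.sum()
            return -(p * np.log2(p)).sum()

        def approximate_entropy(x, m=2, r=None):
            """Approximate Entropy (ApEn) of a 1-D signal such as an LFP trace."""
            x = np.asarray(x, dtype=float)
            if r is None:
                r = 0.2 * x.std()
            def phi(m):
                windows = np.array([x[i:i + m] for i in range(len(x) - m + 1)])
                # Chebyshev distance between every pair of windows
                d = np.abs(windows[:, None] - windows[None, :]).max(axis=2)
                return np.log((d <= r).mean(axis=1)).mean()
            return phi(m) - phi(m + 1)

        print(shannon_entropy([10, 10, 10, 10]))   # 2.0 bits: unbiased arm choice
        print(approximate_entropy(np.sin(np.linspace(0, 8 * np.pi, 200))))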

    Prolonged dopamine signalling in striatum signals proximity and value of distant rewards

    Predictions about future rewarding events have a powerful influence on behaviour. The phasic spike activity of dopamine-containing neurons, and corresponding dopamine transients in the striatum, are thought to underlie these predictions, encoding positive and negative reward prediction errors. However, many behaviours are directed towards distant goals, for which transient signals may fail to provide sustained drive. Here we report an extended mode of reward-predictive dopamine signalling in the striatum that emerged as rats moved towards distant goals. These dopamine signals, which were detected with fast-scan cyclic voltammetry (FSCV), gradually increased or, in rare instances, decreased as the animals navigated mazes to reach remote rewards, rather than having phasic or steady tonic profiles. These dopamine increases (ramps) scaled flexibly with both the distance and size of the rewards. During learning, these dopamine signals showed spatial preferences for goals in different locations and readily changed in magnitude to reflect changing values of the distant rewards. Such prolonged dopamine signalling could provide sustained motivational drive, a control mechanism that may be important for normal behaviour and that can be impaired in a range of neurologic and neuropsychiatric disorders. Funding: National Institutes of Health (U.S.) (Grant R01 MH060379); National Parkinson Foundation (U.S.); Cure Huntington’s Disease Initiative, Inc. (Grant A-5552); Stanley H. and Sheila G. Sydney Fund.
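
    One common reading of such ramps (an interpretation, not the paper's analysis) is that dopamine tracks discounted value, which rises as the remaining distance to the goal shrinks and scales with reward size; a toy Python illustration with assumed parameters:

        # Toy model: V(d) = gamma**d * r for a reward r at distance d; the
        # signal ramps up on approach and scales with reward magnitude.
        gamma = 0.9
        for r in (1.0, 2.0):                                  # reward magnitude
            ramp = [gamma**d * r for d in range(10, -1, -1)]  # approaching the goal
            print([round(v, 2) for v in ramp])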

    Two spatiotemporally distinct value systems shape reward-based learning in the human brain

    Avoiding repeated mistakes and learning to reinforce rewarding decisions is critical for human survival and adaptive actions. Yet, the neural underpinnings of the value systems that encode different decision-outcomes remain elusive. Here, coupling single-trial electroencephalography with simultaneously acquired functional magnetic resonance imaging, we uncover the spatiotemporal dynamics of two separate but interacting value systems encoding decision-outcomes. Consistent with a role in regulating alertness and switching behaviours, an early system is activated only by negative outcomes and engages arousal-related and motor-preparatory brain structures. Consistent with a role in reward-based learning, a later system differentially suppresses or activates regions of the human reward network in response to negative and positive outcomes, respectively. Following negative outcomes, the early system interacts with and downregulates the late system through a thalamic interaction with the ventral striatum. Critically, the strength of this coupling predicts participants’ switching behaviour and avoidance learning, directly implicating the thalamostriatal pathway in reward-based learning.

    Subjective utility moderates bidirectional effects of conflicting motivations on pain perception

    Minimizing pain and maximizing pleasure are conflicting motivations when pain and reward co-occur. Decisions to prioritize reward consumption or pain avoidance are assumed to lead to pain inhibition or facilitation, respectively. Such decisions are a function of the subjective utility of the stimuli involved, i.e., the relative value assigned to the stimuli to compare the potential outcomes of a decision. To test perceptual pain modulation by varying degrees of motivational conflict and the role of subjective utility, we implemented a task in which healthy volunteers had to decide between accepting a reward at the cost of receiving a nociceptive electrocutaneous stimulus or rejecting both. Subjective utility of the stimuli was assessed by a matching task between the stimuli. Accepting a reward coupled to a nociceptive stimulus resulted in decreased perceived intensity, while rejecting the reward to avoid pain resulted in increased perceived intensity, but in both cases only if a high motivational conflict was present. Subjective utility of the stimuli involved moderated these bidirectional perceptual effects: the more a person valued money over pain, the more perceived intensity increased or decreased. These findings demonstrate pain modulation when pain and reward are simultaneously present and highlight the importance of subjective utility for such modulation.

    From uncertainty to reward: BOLD characteristics differentiate signaling pathways

    Background: Reward value and uncertainty are represented by dopamine neurons in monkeys by distinct phasic and tonic firing rates. Knowledge about the underlying differential dopaminergic pathways is crucial for a better understanding of dopamine-related processes. Using functional magnetic resonance blood-oxygen-level-dependent (BOLD) imaging, we analyzed brain activation in 15 healthy male subjects performing a gambling task, upon expectation of potential monetary rewards at different reward values and levels of uncertainty. Results: Consistent with previous studies, ventral striatal activation was related to both reward magnitudes and values. Activation in medial and lateral orbitofrontal brain areas was best predicted by reward uncertainty. Moreover, late BOLD responses relative to trial onset were due to expectation of different reward values and likely represent phasic dopaminergic signaling. Early BOLD responses were due to different levels of reward uncertainty and likely represent tonic dopaminergic signals. Conclusions: We conclude that differential dopaminergic signaling, as revealed in animal studies, is not only represented locally by involvement of distinct brain regions but also by distinct BOLD signal characteristics.
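
    For the two quantities manipulated in the task, a simple illustrative parametrization (an assumption of this sketch, not necessarily the study's) takes value as the expected payoff and uncertainty as the outcome variance, which peaks at a win probability of 0.5:

        # Sketch: value and uncertainty of a gamble paying `amount` with
        # probability p_win and nothing otherwise.
        def gamble_stats(p_win, amount):
            ev = p_win * amount                     # reward value
            var = amount**2 * p_win * (1 - p_win)   # uncertainty, maximal at p = 0.5
            return ev, var

        for p in (0.25, 0.5, 0.75):
            print(p, gamble_stats(p, amount=10))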

    Spatiotemporal neural characterization of prediction error valence and surprise during reward learning in humans

    Reward learning depends on accurate reward associations with potential choices. These associations can be attained with reinforcement learning mechanisms using a reward prediction error (RPE) signal (the difference between actual and expected rewards) for updating future reward expectations. Despite an extensive body of literature on the influence of RPE on learning, little has been done to investigate the potentially separate contributions of RPE valence (positive or negative) and surprise (absolute degree of deviation from expectations). Here, we coupled single-trial electroencephalography with simultaneously acquired fMRI, during a probabilistic reversal-learning task, to offer evidence of temporally overlapping but largely distinct spatial representations of RPE valence and surprise. Electrophysiological variability in RPE valence correlated with activity in regions of the human reward network promoting approach or avoidance learning. Electrophysiological variability in RPE surprise correlated primarily with activity in regions of the human attentional network controlling the speed of learning. Crucially, despite the largely separate spatial extent of these representations, our EEG-informed fMRI approach uniquely revealed a linear superposition of the two RPE components in a smaller network encompassing visuo-mnemonic and reward areas. Activity in this network was further predictive of stimulus value updating, indicating a comparable contribution of both signals to reward learning.
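
    The valence/surprise split described above amounts to taking the sign and the magnitude of the RPE; a minimal Python sketch (the framing only, not the paper's estimation pipeline):

        # Decompose an RPE into valence (sign) and surprise (magnitude).
        def rpe_components(reward, expected):
            delta = reward - expected
            valence = 1 if delta > 0 else -1 if delta < 0 else 0
            surprise = abs(delta)
            return valence, surprise

        print(rpe_components(reward=1.0, expected=0.75))   # (1, 0.25)
        print(rpe_components(reward=0.0, expected=0.75))   # (-1, 0.75)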

    Temporal-Difference Reinforcement Learning with Distributed Representations

    Temporal-difference (TD) algorithms have been proposed as models of reinforcement learning (RL). We examine two issues of distributed representation in these TD algorithms: distributed representations of belief and distributed discounting factors. Distributed representation of belief allows the believed state of the world to distribute across sets of equivalent states. Distributed exponential discounting factors produce hyperbolic discounting in the behavior of the agent itself. We examine these issues in the context of a TD RL model in which state-belief is distributed over a set of exponentially discounting “micro-agents” (µAgents), each of which has a separate discounting factor (γ). Each µAgent maintains an independent hypothesis about the state of the world and a separate value estimate of taking actions within that hypothesized state. The overall agent thus instantiates a flexible representation of an evolving world-state. As with other TD models, the value-error (δ) signal within the model matches dopamine signals recorded from animals in standard conditioning reward paradigms. The distributed representation of belief provides an explanation for the decrease in dopamine at the conditioned stimulus seen in overtrained animals, for the differences between trace and delay conditioning, and for transient bursts of dopamine seen at movement initiation. Because each µAgent also includes its own exponential discounting factor, the overall agent shows hyperbolic discounting, consistent with behavioral experiments.
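
    The discounting claim can be checked numerically: averaging exponential curves γ^t over a population of µAgents with different γ yields an approximately hyperbolic curve. The γ grid and the constant k below are arbitrary illustrative choices, not the paper's parameters.

        import numpy as np

        # Mixture of exponential discounts across 100 µAgents versus a
        # hyperbolic curve 1 / (1 + k*t); the two roughly coincide.
        t = np.arange(0, 20)
        gammas = np.linspace(0.05, 0.99, 100)   # one discount factor per µAgent
        mixture = np.mean([g**t for g in gammas], axis=0)
        hyperbolic = 1.0 / (1.0 + 0.9 * t)      # k = 0.9, chosen for comparison
        for ti in (0, 1, 5, 10):
            print(ti, round(mixture[ti], 3), round(hyperbolic[ti], 3))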